[SOUND] So
we talked about a page rank as a way to
to capture the Authorities.
Now we also looked at the, some other
examples where a hub might be interesting.
So, there is another
algorithm called the HITS and
that's going to do compute the scores for
us.
Authorities & Hubs.
Intuitions of,
pages that are widely cited, good, sorry,
there is, then,
there is pages that are cited.
Many other pages are good Hubs, right?
But there, I think that the.
Most interesting idea of this
algorithm HITS is, it's going to use,
a reinforcement mechanism to kind of
help improve the scoring for
Hubs and the Authorities.
And here, so here's the idea,
it will assume that good
authorities are cited by good hubs.
That means if you're cited by
many pages with good hub scores,
then that increases your authority score.
And similarly, good hubs are those
that pointed to good authorities.
So if you get you point it to
a lot of good authority pages,
then your hub score would be increased.
So you then, you would have
iterative reinforce each other,
because you can point
it to some good hubs.
Sorry, you can point it
to some good authorities.
To get a good hub score.
Whereas those authority scores,
would be also improved,
because they are pointed to by a good hub.
And this hub is also general,
it can have many applications in graph and
network analysis.
So just briefly, here's how it works.
We first also construct the matrix, but
this time we're going to
construct the Adjacency matrix.
We're not going to normalize the values,
so if there's a link there's a y.
If there's no link that's zero.
Right again, it's the same graph and then,
we're going to define the top score of
page as a sum of the authority scores
of all the pages that it appoints to.
So whether you are hub that really depends
on whether you are pointing to a lot of,
good authority pages.
That's what it says in the first equation.
Your second equation,
will define the authority score of a page
as a sum of the hub scores
of all those pages.
That they point to, so whether you
are a good authority would depend on
whether those pages that
are pointing to you are good Hubs.
So you can see this a forms
a iterative reinforcement mechanism.
Now these two equations
can be also written.
In the matrix fo-, format.
Right, so
what we get here is then the hub vector is
equal to the product of
the Adjacency matrix.
And the authority vector.
And this is basically the first equation.
Right.
And similarly, the second equation can
be returned as the authority vector
is equal to the product of A transpose
multiplied by the hub vector.
And these are just different ways
of expressing these equations.
But what's interesting is that if
you look at to the matrix form.
You can also plug-in the authority
equation into the first one.
So if you do that, you can actually
make it limited to the authority vector
completely, and
you get the equation of only hub scores.
Right, the hub score vector is equal
to A multiplied by A transpose.
Multiplied by the hub score vector again.
And similarly we can do
a transformation to have equation for
just the authorities scores.
So although we framed the problem
as computing Hubs & Authorities,
we can actually eliminate the one of them
to obtain equation just for one of them.
Now the difference between this and
page is that, now the matrix
is actually a multiplication of the mer-,
Adjacency matrix and its transpose.
So this is different from page rank.
Right?
But mathematically then we would
be computing the same problem.
So in ha, in hits,
we're keeping would initialize the values
that state one for all these values.
And then with the algorithm will apply
these, these equations essentially and
this is equivalent if you multiply that.
By, by the matrix.
A and A transpose.
Right.
And so the arrows of these are exactly
the same in the debate rank.
But here, because the Adjacency matrix
is not normalized, so what we have to do
is to, what we have to do is after each
iteration we have to do normalize.
And this would allow us to
control the grooves of value.
Otherwise they would,
grew larger and larger.
And if we do that, and
then we will basically get a, HITS.
I was in the computer, the hub scores and
also the scores for all of the pages.
And these scores can then be used,
in ranging to start the PageRank scores.
So to summarize, in this lecture we have
seen that link information is very useful.
In particular,
the Anchor text base is very useful.
To increase the the text
representation of a page.
And we also talk about the PageRank and
HITS algorithm as two major
link analysis algorithms.
Both can generate scores for.
What pages that can be used for
the, the ranking function.
Those that PageRank and
the HITS also very general algorithms, so
they have many applications in
analyzing other graphs or networks.
[MUSIC]

